[Slightly edited from a mail I received --Guido]
From: Stephen Travis Pope
To: Guido van Rossum
Subject: FAQ: Audio File Formats--The High End
Dear Mr. van Rossum,
Hoe gaat't? (I worked in Amsterdam for a while and hou van Holland.)
INTRODUCTION
I recently came across InternetTalkRadio while working at the Swedish
Institute for Computer Science, and read with great interest your
document on audio file formats. I find this a very valuable service to
the community and have one question and one contribution. Maybe I
should make the contribution first.
I have been involved in computer music and DSP since the early 1970's,
and have used more sound file formats than I care to remember (well,
actually I can't remember several of them). While your document treats
in detail the requirements of, and formats used in, telecommunications
and personal computer-based musical applications, I think it would
profit from more detail about the high-end formats and sound file
systems used in multi-channel computer music production. I will attempt
to provide you with the information I'm aware of below, with the
assumption that you may edit it according to your needs if you choose
to include or mention it in future editions of your FAQ.
HISTORY
In your list of "self-describing file formats" you mention the "IRCAM"
sound file system. This software has now been superseded by the
so-called "BICSF" (for Berkeley/IRCAM/CARL Sound File system) software
release. I include the standard document describing BICSF as an
Appendix to this letter. More recently, there has been an effort at
Princeton (Prof. Paul Lansky) and Stanford (myself) to standardize
several extensions to BICSF, which I'll outline below.
During the late 1970's and early 1980's, several sites developed
UNIX-based sound file systems for use in computer music. These early
systems generally included real changes to the UNIX file system, so
that separate disks or disk partitions were used for sound storage.
(Many still feel this is a good idea.)
The "root" of most of this work is the "csound" file system, first
released around 1980 and developed by D. Gareth Loy at the Computer
Audio Research Lab (CARL) at UC San Diego (not to be confused with the
MIT programming language of the same name, which it predates). It is a
real-time, high-throughput sound file system that ran on DEC VAX and
PDP-11 computers before the advent of the Berkeley file system. Csound
is part of the CARL music software distribution. This package also
includes the cmusic language (a simple C-based Music V descendant
written by F. Richard Moore), and many other tools such as vocoders and
configurable reverberators. The CARL software distribution is still
available for a small license fee, and now runs on Sun, NeXT, SGI, and
various other UNIX hardware. The CARL software is documented profusely
in Dick Moore's book "The Elements of Computer Music" (see references).
Robert Gross (then at UCB, now at Sun) based his cylinder-contiguous
sound file system (CCSS) on this. Robert took it with him when he moved
to Paris to work at IRCAM in the early 1980's, and they extended it
there. Some time in the later 1980's the several strains of csound
spin-offs were merged into BICSF, which is still used in computer music
circles and offers several advantages over simpler systems such as the
NeXT/SPARC or even lower forms of sound file life.
In an effort to offer interoperability between the BICSF and NeXT/SPARC
systems, Paul Lansky at Princeton (the author of the "cmix" tool kit, the
best thing since velcro if you ask me), altered the BICSF header so
that the first 28 bytes "just happen" to be identical to the NeXT/SPARC
header. The "dataLocation" offset is set to 1024 (or a multiple
thereof) to allow a large header. What comes between the "standard"
28-byte header and the sound may then include the additional
information described below. I have further extended this to allow
more detailed annotation of sound files. I needed this because the
computer music compositions I realize typically involve several
thousand source files amounting to several gigabytes, and so require
very flexible and scalable tools. I interface to these formats with both C-
and Smalltalk-language programs (most of which are in the public
domain). I will refer to this extended BICSF format as "BICSF" below.
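As an illustration of the layout just described, here is a minimal Python sketch that splits a file image into the 28-byte NeXT/SPARC-compatible prefix, the extended header region, and the samples. It assumes the usual NeXT/Sun field order (magic ".snd", data offset, data size, format code, sample rate, channel count, then four bytes of info string), all big-endian. The function name read_snd_prefix and the test image are invented for the example; the contents of the extended region are returned as raw bytes because their layout is not specified at this point.

```python
import struct

SND_MAGIC = 0x2E736E64            # the ASCII bytes ".snd"
PREFIX = struct.Struct(">6I4s")   # six big-endian 32-bit words + 4 info bytes = 28 bytes

def read_snd_prefix(blob):
    """Return (fields, extended_header_bytes, sample_bytes) for a file image."""
    magic, data_location, data_size, fmt, rate, channels, info = PREFIX.unpack_from(blob, 0)
    if magic != SND_MAGIC:
        raise ValueError("not a NeXT/SPARC-style header")
    fields = {
        "data_location": data_location,
        "data_size": data_size,
        "format": fmt,
        "rate": rate,
        "channels": channels,
    }
    extended = blob[PREFIX.size:data_location]   # room for the extra BICSF fields
    samples = blob[data_location:data_location + data_size]
    return fields, extended, samples

# Hypothetical image: 1024-byte header, 8 bytes of 16-bit linear samples (format code 3).
prefix = PREFIX.pack(SND_MAGIC, 1024, 8, 3, 44100, 1, b"\0\0\0\0")
image = prefix + bytes(1024 - PREFIX.size) + bytes(8)

fields, extended, samples = read_snd_prefix(image)
print(fields["data_location"], len(extended), len(samples))
```

A reader that only understands the NeXT/SPARC format would seek to the data offset and never look at the region in between, which is what makes the two formats interoperable.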
THE BICSF SOUND FILE SYSTEM
Three topics are of interest in the BICSF system: the sound file header
structure, the sound file storage system, and the utilities the system
provides for sound file manipulation. I will discuss each of these
in turn.
All but the most Neanderthal sound file formats include some sort of
file header describing the sample rate, sample format, and other
relevant data. The flexibility of this data structure can have a large
influence on the power of the tools that one can build to manipulate
sound files. Modern multimedia and high-quality audio applications
really demand an easily extensible, scalable sound file format.
Going beyond the basic fields of a typical sound file header (e.g., the
NeXT/SPARC structure described in Appendix 2 of your FAQ), at least
three types of information should be stored with sound files: (1) an
ASCII text comment describing the sound file's contents; (2) the
maximum amplitude per channel (with the frame index where it appears);
and (3) a collection of named cue points in the sound file. Other
useful information that might be stored in the header includes the
pitch (scalar or vector), a transcription of the spoken text of a
sound, and the envelope (an array of integer or floating-point values).
Further features specific to a processing method, such as the names of
compression algorithms, noise gate thresholds, or other file names (for
the case of a "virtual" sound file, described next), are also found.
As an example, below is a print out of an extended BICSF sound file
header taken from the Smalltalk-based MODE tool kit (see references).
The lines in this dump correspond to the fields of the C-data structure
or the instance variables of a Smalltalk class description. Note that
strings are enclosed in single-quotes, and that hash-marks (#)
introduce symbolic names in Smalltalk.
name: 'snd/AllGatesAreOpen/Michi_1/slower_c/4a.snd'
rate: 44100.0
channels: 1
format: #linear16Bit
duration: 1.42367 sec
maxAmp: Dictionary (#'1'->10700->23345)
size: 126592 bytes
modified: 93 Apr 25 5:05:22 pm
text: 'droem och vaka'
comment: 'Transposed down about a minor third and slowed down by 35%'
cueList: Dictionary (#droem->(271 to: 29740),
#och->(31815 to: 41035),
#vaka->(41036 to: 62755))
script: 'pv 44100 1024 8192 128 173 0.82 0 0 -i'
parent: 'snd/AllGatesAreOpen/Michi_1/src/4a.snd'
envelope: (an array of 1024 integers)
The maximum amplitude field(s), which are printed above as Smalltalk
dictionaries, store the channel number, the sample frame at which the
value occurred, and the maximum sample's value (i.e., the file above
has one channel whose max. is 23345 at sample frame 10700). The cue
fields have symbolic names, and their values are sample intervals,
i.e., the word "och" ("and" in Swedish) begins at sample frame 31815
and ends at 41035. It is possible to have a sound file that has no
samples of its own, but only cue points into another sound file, a
"virtual sound file." The virtual sound file can include either a file
name and sample range, or a file name and a cue name.
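The numbers in the dump above can be cross-checked with a few lines of Python. This sketch assumes that the header occupies exactly 1024 bytes (the minimum "dataLocation" mentioned earlier) and that cue intervals include both end frames; the variable and function names are illustrative and are not part of MODE or BICSF.

```python
RATE = 44100.0         # frames per second, from the "rate" field
CHANNELS = 1           # from the "channels" field
BYTES_PER_SAMPLE = 2   # #linear16Bit
FILE_SIZE = 126592     # bytes, from the "size" field
HEADER_SIZE = 1024     # assumption: the header fills one 1024-byte block

# Cue name -> (first frame, last frame), copied from the cueList field.
cues = {"droem": (271, 29740), "och": (31815, 41035), "vaka": (41036, 62755)}

def frames_to_seconds(frames):
    """Convert a frame count to seconds at the file's sample rate."""
    return frames / RATE

total_frames = (FILE_SIZE - HEADER_SIZE) // (CHANNELS * BYTES_PER_SAMPLE)
print(f"duration: {frames_to_seconds(total_frames):.5f} sec")

for name, (first, last) in cues.items():
    length = last - first + 1   # both end frames belong to the cue
    print(f"{name}: starts at {frames_to_seconds(first):.3f} s, "
          f"lasts {frames_to_seconds(length):.3f} s")
```

Under these assumptions the computed duration agrees with the 1.42367 sec shown in the dump, which suggests the "size" field counts the header as well as the samples.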
[Implementation detail for C hackers] These additional fields can come
in any order and number and have variable lengths, so they are stored
in the header with a key (an integer that is #defined somewhere), a
length, and the data they hold onto.
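A rough Python sketch of such a key/length/data scheme follows. Everything concrete in it is an assumption made for illustration: the key codes, the 16-bit big-endian key and length words, the payload encodings, and the use of key 0 as an end marker are invented here, and the real BICSF definitions may differ.

```python
import struct

# Hypothetical key codes; the real values are #defined in the BICSF sources.
KEY_END, KEY_COMMENT, KEY_MAXAMP = 0, 1, 2
CHUNK_HEAD = struct.Struct(">HH")   # assumed: 16-bit key, 16-bit payload length

def pack_chunk(key, payload):
    """Prefix a payload with its key and length."""
    return CHUNK_HEAD.pack(key, len(payload)) + payload

def iter_chunks(region):
    """Yield (key, payload) pairs until an end key or the end of the region."""
    pos = 0
    while pos + CHUNK_HEAD.size <= len(region):
        key, length = CHUNK_HEAD.unpack_from(region, pos)
        if key == KEY_END:
            break
        start = pos + CHUNK_HEAD.size
        yield key, region[start:start + length]
        pos = start + length   # the length word lets us skip any payload

# Two fields in arbitrary order, followed by zero padding (read as KEY_END).
region = (pack_chunk(KEY_MAXAMP, struct.pack(">Ii", 10700, 23345))
          + pack_chunk(KEY_COMMENT, b"droem och vaka")
          + bytes(16))

chunks = dict(iter_chunks(region))
print(chunks[KEY_COMMENT].decode("ascii"))
print(struct.unpack(">Ii", chunks[KEY_MAXAMP]))
```

Because every field carries its own length, a program can step over keys it does not recognize, which is what allows the fields to come in any order and number.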
In the csound and CCSS systems, the header also included disk cylinder
pointers, so that it could be stored separately from the sample data,
such as on a normal UNIX file system. More recent implementations have
the header followed immediately by the contiguous sample data, though
this has both advantages and disadvantages. A non-contiguous,
chunk-oriented format might be more flexible. There is still a debate
in the computer music and audio DSP community as to whether this is
necessary or desirable. On the one hand, the Berkeley file system and
its descendants can support partitions with large block sizes, thereby
enabling the high throughput required for real-time performance of,
e.g., quadraphonic 16-bit files at 48 kHz (a frequently-used format).
On the other hand, as mentioned in the BICSF document below, "There are
several reasons to segregate soundfiles from regular UNIX files. [...]
You do not want realtime sound I/O to be in competition with
timesharing I/O. Expect an increase of up to 50% for having a separate
disk and controller for sound."
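To put a number on "high throughput," the sustained data rate of the quadraphonic example can be worked out directly. The short Python sketch below does the arithmetic; the function name is illustrative, and the 8K block size is the maximum quoted in the BICSF document.

```python
def bytes_per_second(channels, bits_per_sample, sample_rate):
    """Sustained data rate of uncompressed linear audio."""
    return channels * (bits_per_sample // 8) * sample_rate

quad = bytes_per_second(channels=4, bits_per_sample=16, sample_rate=48000)
block = 8 * 1024   # bytes per file system block

print(quad, "bytes per second")
print(quad / block, "blocks per second")
```

That is 384,000 bytes, or just under 47 blocks of 8K, every second, and the disk has to deliver it without interruption for as long as the sound plays.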
There are several interesting other features of extended-BICSF headers,
but this introduction should serve to heighten readers' awareness of
what is possible, and hopefully motivate the development of such
facilities based on other popular formats such as AIFF.
The utilities that are part of BICSF mirror the UNIX file manipulation
shell commands, but generally have "sf" appended to their names. The
user has a "current sound file directory" that is distinct from his or
her UNIX current working directory. In modern versions of BICSF, where
sound files are often stored as regular UNIX files, many of these (such
as "cpsf" or "rmsf"), are not needed. Others, such as "lsf," "fromsnd,"
and "tosnd" (previously called "sndin" and "sndout"), are still
generally needed, and are often given hideous and unnecessarily unclear
names such as "sndinfo." Several utilities exist that accept a variety
of sound file formats, such as the SGI Indigo machine's sound tools
that can process either AIFF or NeXT/SPARC files. (Perhaps we should
build "SOX" into our play programs so we don't have to use it
explicitly.)
AVAILABILITY
For more information on getting the CARL software distribution, contact
the center's director, F. Richard Moore (frm@ucsd.edu) or Susan Fichera
(sfl@sdcarl.ucsd.edu).
Paul Lansky's cmix tools are available via ftp from the directory
pub/music on the server princeton.edu.
The MODE Smalltalk tools are available via ftp from the directory
pub/st80 on the server ccrma-ftp.stanford.edu.
REFERENCES
Anyone performing sound I/O on a time-sharing system (like UNIX) should
be referred to Susan Fichera's excellent discussion of the issues
involved in real-time I/O in these real-time-hostile environments. Her
article is: "Machine Tongues XIII: Real-Time Audio Conversion under a
Time-Sharing Operating System" and appeared in "Computer Music Journal"
15(3):27-40 (Fall, 1991).
F. Richard (Dick) Moore's "The Elements of Computer Music" is highly
recommended as a general introduction to CM and digital audio signal
processing. It teaches his cmusic sound compiler language. It appeared
in 1990 from Prentice-Hall books.
My own MODE (Musical Object Development Environment) was described in
detail in the article "The Interim DynaPiano: An Integrated Computer
Tool and Instrument for Composers" in "Computer Music Journal"
16(3):73-91 (Fall, 1992).
A good introduction to software sound synthesis that also addresses
sound file management issues is "Machine Tongues XV: Three Packages for
Software Sound Synthesis" by yours truly in "Computer Music Journal"
17(2):23-54 (Summer, 1993). This article also introduces and compares
cmusic, csound (the language), and cmix.
====================================================================
Stephen Travis Pope
stp@ccrma.stanford.edu (in Palo Alto), stp@sics.se (in Stockholm)
==============================================================
APPENDIX: BICSF Description
(written by ? around 1988, included here unedited) (available by ftp
from the file pub/st80/mode/doc/BICSF.t on ccrma-ftp.stanford.edu)
BICSF Berkeley/IRCAM/CARL Sound Filesystem
ABSTRACT
BICSF is a collection of programs which
implement a filesystem for digital audio
applications running under Berkeley UNIX. This
document gives an overview and describes the
installation procedure.
CREDITS
Contributors to this suite of programs are numerous, but the
main outlines of the system are due to the work of
+ Marshall Kirk McKusick, William N. Joy, Samuel J.
Leffler, Robert S. Fabry for the creation of the
Berkeley Fast Filesystem,
+ Gareth Loy at CARL for the prototype CARL csound(1carl)
filesystem,
+ Rob Gross and Dan Timis at IRCAM for the IRCAM sound
filesystem,
+ Brad Garton at Columbia for the Digisound-16 device
driver and associated play and record programs.
The soundfile system code here is largely that of Rob Gross
and Dan Timis of the IRCAM group. Author ascription has
been appended to the manual pages where known.
The device drivers were written by:
+ DSC-200: Rusty Wright at CARL,
+ Digisound 16: Brad Garton at Columbia-Princeton,
+ Dyaxis: Susan Fichera at CARL.
The Digisound 16 driver was updated for SunOS 4.0 by Susan Fichera.
The integration of these various sources into one package
was done by Gareth Loy and Abe Singer at CARL and CMIL.
LIST OF PROGRAMS AND ALIASES
Following is a list of programs and aliases, and brief
descriptions:
ALIASES USING STANDARD UNIX COMMANDS
catsf - concatenate soundfiles
chgrpsf - change soundfile group ownership
chmodsf - change soundfile mode
chownsf - change soundfile ownership
cpsf - copy soundfile
mkdirsf - make soundfile directory
mvsf - move a soundfile
pwdsf - print working soundfile directory
rmdirsf - remove (empty) soundfile directory
rmsf - remove soundfile (or directory tree)
BACKWARD COMPATIBILITY
sndin - read from soundfile
sndout - write to soundfile
SPECIAL PROGRAMS
createsf - prepare soundfile for recording
fromsf - read from soundfile
gainsf - normalize or adjust gain of soundfile
lsf - list sound files
normsf - normalize amplitude of soundfile
pansf - pan sound file
peaksf - compute peak amplitude and record in soundfile header
querysf - print out contents of header
restorsf - restore soundfile from csound dumpsf tape
retrosf - retrograde a soundfile
scalesf - gain scale a soundfile
setsf - set or modify soundfile header parameters
sndawk - signal modification language similar to awk for soundfiles
swabsf - swap bytes of samples in soundfile
tarsf - tape archive of soundfiles
tosf - write to soundfile
transpsf - transpose pitch of soundfile
xdr - convert soundfile to Sun external data representation
PLAYBACK/RECORD/MONITOR PROGRAMS
monitor - monitor digital output of ADCs
play - play soundfile
record - record soundfile
NAMES OF PROGRAMS
In the interests of name coherency, some programs have been
renamed from their original forms at CARL, IRCAM, and
Columbia-Princeton.
PROGRAMS:
ORIGINAL      RENAMED
sfcreate      createsf
sndcat        catsf
sndgain       gainsf
sndin         fromsf
sndinfo       querysf
sndnorm       normsf
sndout        tosf
sndpan        pansf
sndpeak       peaksf
sndreverse    retrosf
sndscale      scalesf
sndset        setsf
sndtransp     transpsf
PLAY, RECORD, ETC:
DigiSound-16: ai{play,record,monitor,reset}
Dyaxis: dy{play,record}
DSC-200: ds{play,record}
Aliases have been created for all the original names, and
are listed along with the rest of the aliases in
./bicsf/std.sfaliases.m4.
ORGANIZATION OF SOFTWARE
Software is divided into three groups:
+ device drivers, found in subdirectory ../sys,
+ applications programs which depend upon the type of
converters, found in subdirectories ./{ds,ai,dy}play and
./{ds,ai,dy}record,
+ soundfile manipulation and signal processing programs
(found in the rest of the directories).
BRIEF THEORY OF OPERATION
Using BICSF, one is presented with two current working
directories: one's regular UNIX current working directory
(cwd), plus the BICSF cwd, which is initialized to point to
one's home soundfile directory. Soundfiles are ordinarily
partitioned on a separate disk from other files. However,
the BICSF soundfile directory is really a standard UNIX
filesystem at bottom. Having soundfiles on separate disks
from regular UNIX disks avoids competition for head movement
with regular UNIX processes. It is also advisable where
possible to have a separate disk controller for soundfile
disks to improve throughput for high sampling rates.
There are several reasons to segregate soundfiles from
regular UNIX files.
+ Conventional wisdom is that the block/fragment size of
the soundfile partitions should be set to their maximum
(currently 8K blocks and 8K fragments). This is
desirable for maximum disk throughput. The bigger the
blocks, the more efficient the disk I/O can be. But
UNIX files tend to favor smaller granularization, since
there tend to be more of them, and they tend to be
small. It is more common to have UNIX partitions set
to 4k/512 to allow more effective filling of the disk.
Thus, the two types of files demand different treatment
to optimize space (for UNIX files) and speed (for BICSF
files).
+ System administration: soundfiles are BIG. It is
better to have them separate from regular UNIX files so
you don't have to do huge system dumps of users' home
directory trees. In fact, at CARL, we do not dump
soundfile systems, but leave this to the users to do as
they see fit.
+ Speed of throughput: you do not want realtime sound I/O
to be in competition with timesharing I/O. Expect an
increase of up to 50% for having a separate disk and
controller for sound.
The idea of simultaneous working directories for UNIX and
BICSF filesystems overcomes the problem of having to name
long absolute pathnames to get to one's soundfiles. This
implementation (developed by Robert Gross) consists of a set
of aliases listed in ./bicsf/std.sfaliases.m4. An
environment variable SFDIR contains the current working
soundfile directory. The UNIX command pwd has a BICSF
counterpart with the following definition:
alias pwdsf '(cd $SFDIR; /bin/pwd \!*)'
Likewise, the UNIX command cat has this counterpart:
alias catsf '(cd $SFDIR; /cmil/bin/catsf \!*)'
cdsf, the BICSF equivalent of cd, sets the SFDIR variable (its
definition repays careful study). All BICSF programs must
have an alias of the kind shown above.
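For readers unfamiliar with csh, the effect of the "(cd $SFDIR; command)" pattern can be imitated in Python: the command runs in a child process whose working directory is SFDIR, while the caller's own working directory is left alone. This is only a sketch of the idea, not part of BICSF. The helper name run_in_sfdir is invented, and a temporary directory stands in for a real soundfile directory.

```python
import os
import subprocess
import sys
import tempfile

def run_in_sfdir(argv, env=None):
    """Run argv with SFDIR as its working directory, like '(cd $SFDIR; cmd args)'."""
    env = os.environ if env is None else env
    return subprocess.run(argv, cwd=env["SFDIR"],
                          capture_output=True, text=True, check=True)

with tempfile.TemporaryDirectory() as sfdir:
    env = dict(os.environ, SFDIR=sfdir)   # stand-in for the user's soundfile cwd
    unix_cwd = os.getcwd()

    # A portable substitute for /bin/pwd: the child prints its own cwd.
    show_cwd = [sys.executable, "-c", "import os; print(os.getcwd())"]
    child_cwd = run_in_sfdir(show_cwd, env).stdout.strip()

    same = os.path.realpath(child_cwd) == os.path.realpath(sfdir)
    unchanged = os.getcwd() == unix_cwd
    print(same, unchanged)
```

The parentheses in the csh alias play the role of the child process here: the cd happens in a subshell, so the user's regular UNIX working directory is never disturbed.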
ADJUSTING FOR LOCAL CONDITIONS
You should inspect the aliases in std.sfaliases.m4 and
std.cshrc.m4 to make sure they agree with local
requirements. In particular, check the play, record, and monitor
aliases in std.sfaliases.m4, and set them to execute the
play/record programs for the converters you are using. Also
check values of BINSF, ROOT_SFDIR, HOME_SFDIR, and SFDIR for
local conditions.
When the system is installed, these two files are run
through the UNIX macro preprocessor m4 to resolve the location
of the programs the aliases refer to. m4 macros defining
standard pathnames for executables, manual pages, libraries,
alias files, sources, etc. must be listed in the file
config.m4, usually located in /usr/include/carl/config.m4.
See config.m4(1carl) for details.
SOURCES
Sources may be placed in one of several places depending
upon local conventions. At CARL, this path is
/carl/src/carl/src/bicsf. Elsewhere, a good place to put it
(or find it) is /`hostname`/src/import/carl/src/bicsf, where
`hostname` is the name of your machine.
The applications programs depend upon a library:
libbicsf.a. After creation, this library may be in one of
several places, depending upon local conventions. At CARL,
this path is /carl/lib/libbicsf.a. Elsewhere, a good place
to put it (or find it) is /`hostname`/lib/libbicsf.a.
It can also be put in /usr/local/lib/libbicsf.a, but as
this area is usually wiped out across upgrades of UNIX, it
is preferable to make a symbolic link, /usr/local/lib ->
/`hostname`/lib. In this way, the loader can still find
local libraries, allowing the loader's -l flag convention:
% cc file.c -lbicsf
to succeed. Otherwise, a full path to the file could be
given:
% cc file.c /`hostname`/lib/libbicsf.a
Include statements in the source code all make generic references
to include files. The Makefiles in each directory are made
from their Makefile.m4 prototypes in each source directory,
and compile the programs to look in the correct locations
for include files. These are almost universally relative
paths to the directory ./include (except for device
drivers).
HARDWARE INSTALLATION
Besides the installation of your converters, it is important
to block out appropriate disk partitions for BICSF soundfiles
and give them the proper block/fragment sizes.
Conventional wisdom is that you want to set them to 8K/8K
block/fragment size. The larger the block/fragment size,
the more efficient the disk can be in reading/writing data.
If possible, you do want sound on a separate physical disk,
not sharing any other UNIX function, including swapping,
etc. It's also useful if sound disks are on separate
controllers. CARL benchmarks are that a Digisound-16 can run
48,000 Hz stereo from a Fujitsu Eagle with a single Xylogics
450 controller on a Sun-3 with a little spare bandwidth. A
second controller helps a lot. There are some files in the
device driver directories for the ai driver (for the
Digisound-16) which suggest further performance
enhancements.
DEVICE DRIVER INSTALLATION
Refer to the appropriate subdirectory in ../sys for the type
of converter you have and follow the directions you find
there.
SOFTWARE INSTALLATION
The code is installed using standard CARL Software
conventions. If this code is being installed as part of the CARL
Software Distribution, the process should be mostly
automatic, save for the installation of the device drivers.
Refer to the instructions for the Distribution, but all that
need be done is to first say
% make
then
% make install
and finally
% make clean
To install standalone, proceed as follows:
First, you need a copy of libcarl.a from the CARL software
distribution to compile some routines, so don't bother
unless you have one elsewhere, or are willing to write
workarounds (which wouldn't be too difficult) for the missing
routines.
Edit the file ./include/config.m4, which contains default
and built-in pathnames for programs. For standalone
installation, the most important are m4SNDFILESYSTEM, m4INCLUDE,
m4DESTDIR, and m4MANDIR.
Then execute the file ./Makefirst as follows:
% make -f Makefirst
This creates the subdirectory /usr/include/carl, and puts
the file ./include/config.m4 in it. It is strongly advised
that this subdirectory be used. If you want to put it
somewhere else, you must edit all Makefile.m4 files in this
directory tree to point to the new directory, plus change
any C program files that make reference to
/usr/include/carl. There is a script to change the
makefiles called ./misc/fixmakefiles that you can use to
expedite this process, if necessary.
Next, say
% make
which does the following steps:
+ remakes all Makefiles with correct paths,
+ installs the remaining include files in
/usr/include/carl,
+ builds the library,
+ compiles application programs.
Next say
% make install
which will install binaries, manual pages, and system
aliases.
Lastly, say
% make clean
to remove executables and .o files.
To run off documentation, say
% make roffall
SYSTEM ALIASES
The contents of ./bicsf/std.sfaliases must somehow be
sourced by all users when they log in. Furthermore, it is
useful to have users refer to a master copy, so that as
BICSF programs come and go, only a single file needs to be
changed. At CMIL, for instance, this is done as follows.
All users have a standard .cshrc file in their home
directories which contains the following line:
source /`hostname`/lib/std.cshrc
where `hostname` is either the name of the machine, or some
other well-known local path. The file std.cshrc in turn
sources /`hostname`/lib/std.sfaliases, which initializes
shell variables and establishes the system aliases for BICSF
commands.
There is a prototype .cshrc file, ./bicsf/dotcshrc, which is
provided for convenience. It should be the basis of the
.cshrc files all users have. At CARL, we have an adduser
shell script which installs new users. Part of its task is
to copy dotcshrc to ~newuser/.cshrc.
============================ E N D ===========================